maps, and which reduce back to the original information normally contained in a contingency
table.
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TECHNICAL MEMORANDUM X-73347 


EVALUATION CRITERIA FOR SOFTWARE CLASSIFICATION 
INVENTORIES, ACCURACIES, AND MAPS

I. INTRODUCTION 


Considerable emphasis is now being given to the evaluation of image
classification and compression techniques. This report describes the evaluation
criteria and procedures that have been proposed and developed to focus attention
on the existing state of the art and provide guidance for future research efforts.
Although there are many criteria, e.g., costs, running times, computer resources,
etc., that should be considered in evaluating techniques, the main emphasis of
this report is on statistical performance.

Assume that multispectral image data have been classified using a
particular technique to produce a classification map (CM) and that the CM has
been overlaid with a digital version of a ground truth map (GTM). The normal
procedure is to produce a contingency table, such as shown in Table 1, and
determine a percentage accuracy as a measure of the goodness of a classification
technique. However, there would appear to be considerable risk involved in
judging the merits of various classification techniques based upon this one
number. Hence, one of the purposes of this report is to mathematically explore
the contingency table to determine how much additional information can be
extracted. However, it must also be kept in mind that the table only provides
numerical results and contains relatively little information concerning the map
producing abilities of the various classification techniques. The desired end
result is that there will be a sufficient number of mathematical criteria that
can be examined to ensure as much completeness in the evaluation as possible.
Criteria and procedures similar to what is discussed in the report can also be
adapted to evaluate compression and change detection analysis results.

The contingency tables used in this report resulted from a cooperative
evaluation of classification techniques which involved Marshall Space Flight
Center, Huntsville, Alabama, and the Tennessee State Planning Office,
Nashville, Tennessee. Landsat data from the Bald Knob, Tennessee, Quadrangle
were used as a test site and four sets of seasonal data were also included for
multitemporal evaluation. All of the techniques discussed in this report are
supervised techniques and all used the same training areas for the classification
results. The five classification results that are discussed include the Gaussian
Maximum Likelihood which was used on one season of data as well as all four
seasons simultaneously, the Linear Classifier Model which was also used on
one season as well as all four seasons, and the Density Slicing Classifier which
was used on only one season of data. The Linear Classifier uses hyperplanes
to separate feature categories, while the Density Slicing Method selects a
channel of data as well as a class interval in that channel to separate feature
categories.

Section II describes contingency tables and tests derived from the tables
in a general manner, and Section III describes the evaluation of the classification
analysis results. Section IV describes a proposed approach for evaluating
classification maps that reduces back to the normally used contingency table.


TABLE 1. GENERAL 5 BY 5 CONTINGENCY TABLE 
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II. MATHEMATICAL DESCRIPTION OF EVALUATION CRITERIA 


Table 1 shows the general form of a 5 by 5 contingency table that is
consistent in size with the tables used in Section III. The table indicates that
there are five categories on the GTM being compared with five categories on the
CM. The elements $n_{ij}$ tell how many pixels in class j on the CM occur at the
same locations as pixels in class i on the GTM. The symbol $e_i$ is the number
of pixels belonging to category i on the GTM and is the number that is expected
to be obtained from the classification results. The symbol $o_j$ is the number of
pixels that were classified in category j on the CM, or the number that is
observed, which is usually different from what is expected. Mathematically
speaking,

$$e_i = \sum_j n_{ij} \quad \text{and} \quad o_j = \sum_i n_{ij} \qquad (1)$$



The symbols $\pi$ are probabilities of occurrence,

$$\pi_{e_i} = e_i / N_T , \quad \pi_{o_j} = o_j / N_T , \quad \text{and} \quad \pi_{n_{ij}} = n_{ij} / N_T \qquad (2)$$



where $N_T$ is the total number of pixels. The symbols $N_c$, %c, and %I are
the number of correctly classified pixels, the classification accuracy, and the
inventory accuracy, respectively. These are computed using the following
relations:

$$N_c = \sum_i n_{ii} , \quad \%c = 100 \, (N_c / N_T) , \quad \text{and} \quad \%I = 100 \left[ 1 - \frac{1}{2 N_T} \sum_i | e_i - o_i | \right] \qquad (3)$$



For the inventory accuracy, the number wrong is given by the summation
of the absolute value differences, which has to be divided by two. The factor of
two is necessary because if one pixel changes category, two columns are affected
on the contingency table and the pixels are in effect counted twice. The inventory
accuracy can also be computed by choosing the smaller of $e_i$ or $o_i$, summing
over the categories, and multiplying by $100/N_T$, which gives the same result.
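The quantities defined so far can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `accuracy_summary` and the 3 by 3 table are hypothetical and are not taken from the report's data. The table is indexed as `table[i][j]`, with i the GTM class and j the CM class, and the inventory accuracy is computed both ways described above so that the two forms can be compared.

```python
def accuracy_summary(table):
    """Return inventories and accuracies for a square contingency table.

    table[i][j] = number of pixels in GTM class i that fall in CM class j.
    """
    k = len(table)
    n_total = sum(sum(row) for row in table)

    # Expected (GTM) inventory: row sums. Observed (CM) inventory: column sums.
    e = [sum(table[i]) for i in range(k)]
    o = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Correctly classified pixels lie on the diagonal.
    n_correct = sum(table[i][i] for i in range(k))
    pct_c = 100.0 * n_correct / n_total

    # Inventory accuracy from the absolute differences (divided by two).
    abs_diff = sum(abs(e[i] - o[i]) for i in range(k))
    pct_i_abs = 100.0 * (1.0 - abs_diff / (2.0 * n_total))

    # Inventory accuracy from the smaller of e_i and o_i.
    pct_i_min = 100.0 * sum(min(e[i], o[i]) for i in range(k)) / n_total

    return {"e": e, "o": o, "n_correct": n_correct,
            "pct_c": pct_c, "pct_i_abs": pct_i_abs, "pct_i_min": pct_i_min}


# Hypothetical 3-class example (not data from the report).
example = [[50, 10, 0],
           [5, 30, 5],
           [0, 10, 90]]
summary = accuracy_summary(example)
```

For this made-up table the two inventory accuracy forms agree at 95 percent, while the classification accuracy is 85 percent.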
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Two other tables can be generated from the actual contingency table; 
however, it is not necessary to do so because the actual table already contains 
the information. These two tables will be discussed to illustrate the concepts of 
randomness and optimumness. 


The concept of randomness is illustrated using the maximum likelihood
estimators. The likelihood of an observed sample of size $N_T$ being picked from
an assumed population, i.e., $e_i$ and $o_j$ are given and remain constant under
all conditions, is tantamount to replacing $n_{ij}$ with $e_i o_j / N_T$ or
$\pi_{o_j} e_i$ in the contingency table. The only other quantities that change
in the table are $N_c$, the number of correctly classified pixels, and %c. This
result should hold true for any sample of size $N_T$ picked from an assumed
population and should be a random or "worst case" classification accuracy that
is expected.


The optimum case classification accuracy that can be expected for a
given inventory ($e_i$ and $o_j$ given) occurs when the classification accuracy
equals the inventory accuracy. This is tantamount to replacing $n_{ii}$ with the
smaller of $e_i$ or $o_i$ on the diagonal, and the remaining $n_{ij}$
($i \neq j$) will either be zero or indeterminate. The only other quantities
that are changed are again $N_c$ and %c.
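As a rough illustration of the random and optimum cases, the sketch below computes the accuracy range for a hypothetical table. The function name and the example numbers are assumptions made for illustration. The "percent of optimum" is taken here to be the actual accuracy divided by the optimum accuracy; the text does not give an explicit formula, so that ratio is an assumed reading.

```python
def random_and_optimum_accuracy(table):
    """Bracket the actual classification accuracy for a fixed inventory.

    table[i][j] = number of pixels in GTM class i that fall in CM class j.
    """
    k = len(table)
    n_total = sum(sum(row) for row in table)
    e = [sum(table[i]) for i in range(k)]
    o = [sum(table[i][j] for i in range(k)) for j in range(k)]

    # Random case: each diagonal element is replaced by e_i * o_i / N_T.
    n_c_random = sum(e[i] * o[i] for i in range(k)) / n_total

    # Optimum case: each diagonal element is replaced by min(e_i, o_i).
    n_c_optimum = sum(min(e[i], o[i]) for i in range(k))

    n_c_actual = sum(table[i][i] for i in range(k))

    pct_random = 100.0 * n_c_random / n_total
    pct_optimum = 100.0 * n_c_optimum / n_total
    pct_actual = 100.0 * n_c_actual / n_total

    # Assumed definition: actual accuracy as a percentage of the optimum.
    pct_of_optimum = 100.0 * pct_actual / pct_optimum

    return pct_random, pct_actual, pct_optimum, pct_of_optimum


# Hypothetical 3-class example (not data from the report).
example = [[50, 10, 0],
           [5, 30, 5],
           [0, 10, 90]]
pct_random, pct_actual, pct_optimum, pct_of_optimum = random_and_optimum_accuracy(example)
```

In this example the random case gives 37 percent, the actual accuracy is 85 percent, and the optimum case equals the 95 percent inventory accuracy.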

There are several statistical performance criteria that can now be
calculated from the contingency table, and these are discussed as follows:


1. The first criterion is the actual classification accuracy. The
classification accuracies for the random and optimum cases provide lower
and upper limits for the accuracy range, and a percent of optimum
accuracy can be computed for a technique as a measure of how well
it performed versus how well it could have performed.


The remaining criteria are concerned with chi-squared tests that are convenient
to use because the table contains information related to what is expected and
what is observed. The chi-squared tests and formulas for computing the
chi-squared values relating to those tests are as follows:


2. Hypothesis: The distribution ($o_i$) of the classification inventory
agrees with the distribution ($e_i$) of the ground truth inventory:
3. Hypothesis: The distribution ($n_{ii}$) of the correctly classified pixels
agrees with the distribution of the ground truth inventory:

$$\chi_2^2 = \frac{1}{N_c} \sum_i \frac{n_{ii}^2}{\pi_{e_i}} - N_c \qquad (5)$$



4. Hypothesis: The distribution of the number of correctly and
incorrectly classified pixels is optimum with respect to the given
inventory and without regard to class:



These three chi-squared values should be as small as possible to satisfy the
hypotheses, while the remaining chi-squared values to be discussed should be
as large as possible so that the hypotheses will be rejected.
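The first two of these tests can be sketched as ordinary goodness-of-fit computations. The sketch rests on two assumptions that should be read as such: the test of hypothesis 2 is taken to be the usual Pearson form, the sum over classes of $(o_i - e_i)^2 / e_i$, and the test of hypothesis 3 takes the expected number of correct pixels in class i to be $\pi_{e_i} N_c$. The function names and the example table are hypothetical.

```python
def pearson_chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((ob - ex) ** 2 / ex for ob, ex in zip(observed, expected))


def inventory_tests(table):
    """Chi-squared values for hypotheses 2 and 3 under the stated assumptions.

    table[i][j] = number of pixels in GTM class i that fall in CM class j.
    """
    k = len(table)
    n_total = sum(sum(row) for row in table)
    e = [sum(table[i]) for i in range(k)]
    o = [sum(table[i][j] for i in range(k)) for j in range(k)]
    diagonal = [table[i][i] for i in range(k)]
    n_correct = sum(diagonal)

    # Hypothesis 2: classification inventory versus ground truth inventory.
    chi_inventory = pearson_chi_squared(o, e)

    # Hypothesis 3: correctly classified pixels versus the ground truth
    # proportions, with expected counts pi_e_i * N_c.
    expected_correct = [e_i / n_total * n_correct for e_i in e]
    chi_correct = pearson_chi_squared(diagonal, expected_correct)

    return chi_inventory, chi_correct


# Hypothetical 3-class example (not data from the report).
example = [[50, 10, 0],
           [5, 30, 5],
           [0, 10, 90]]
chi_inventory, chi_correct = inventory_tests(example)
```

Both values have k - 1 degrees of freedom for k classes, which is 4 df for the 5 by 5 tables used in the report.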


5. Hypothesis: The correctly classified pixels are randomly distributed:

(7)

[...] classified pixels for the actual and random case.
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6. Hypothesis: Each classification feature is randomly distributed 
among the ground truth features according to the classification 
inventory: 


$$\chi_5^2 = \frac{1}{e_i} \sum_j \frac{n_{ij}^2}{\pi_{o_j}} - e_i \qquad (8)$$

where i refers to the feature on the GTM and j refers to the
feature on the CM.

7. Hypothesis: Each ground truth feature is randomly distributed 
among the classification features according to the ground truth 
inventory: 


$$\chi_6^2 = \frac{1}{o_j} \sum_i \frac{n_{ij}^2}{\pi_{e_i}} - o_j \qquad (9)$$

where i and j have the same meaning as in equation (8).

8. Hypothesis: The number of correctly and incorrectly classified 
pixels are randomly distributed without regard to class: 


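$$\chi_7^2 = \frac{\left[ \sum_i ( n_{ii} - \pi_{o_i} e_i ) \right]^2}{\sum_i \pi_{o_i} e_i} + \frac{\left[ \sum_i ( n_{ii} - \pi_{o_i} e_i ) \right]^2}{N_T - \sum_i \pi_{o_i} e_i} \qquad (10)$$
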


9. Hypothesis: The number of correctly and incorrectly classified 
pixels for a particular class are randomly distributed: 
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where j represents the class. 
10. Hypothesis: The distribution of the classified pixels is independent
of the ground truth:
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(12) 


11. The final criterion is the coefficient of contingency, which is similar
to a correlation coefficient and is calculated from $\chi_8^2$. The
coefficient is given by

$$C = \left[ \frac{\chi_8^2}{N_T \, (k - 1)} \right]^{1/2} \qquad (13)$$



where k is either the number of features on the GTM or CM, 
whichever is smaller. 
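The randomness test of hypothesis 8, the independence test of hypothesis 10, and the coefficient of contingency can be sketched as follows. Several points are assumptions: the independence statistic is taken to be the usual contingency table form with expected counts $e_i o_j / N_T$, the coefficient is taken to be the square root of that statistic divided by $N_T (k - 1)$, and the function names are hypothetical. The sketch also assumes that no row or column of the table is empty, since an expected count of zero would cause a division by zero.

```python
import math


def _inventories(table):
    """Row sums (GTM inventory), column sums (CM inventory), and total."""
    e = [sum(row) for row in table]
    o = [sum(row[j] for row in table) for j in range(len(table[0]))]
    return e, o, sum(e)


def correct_vs_random_chi_squared(table):
    """Hypothesis 8: correct and incorrect totals versus the random case (1 df)."""
    e, o, n_total = _inventories(table)
    expected_correct = sum(e[i] * o[i] for i in range(len(e))) / n_total
    n_correct = sum(table[i][i] for i in range(len(e)))
    deviation_sq = (n_correct - expected_correct) ** 2
    return (deviation_sq / expected_correct
            + deviation_sq / (n_total - expected_correct))


def independence_chi_squared(table):
    """Hypothesis 10 (assumed standard form): CM independent of GTM."""
    e, o, n_total = _inventories(table)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            expected = e[i] * o[j] / n_total
            chi2 += (n_ij - expected) ** 2 / expected
    return chi2


def contingency_coefficient(table):
    """Assumed form: sqrt(chi2 / (N_T * (k - 1))), k = smaller table dimension."""
    _, _, n_total = _inventories(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(independence_chi_squared(table) / (n_total * (k - 1)))


# Hypothetical 2-class examples: a perfect classification and a random one.
perfect = [[10, 0], [0, 10]]
uninformative = [[10, 10], [10, 10]]
```

For a k by k table the independence statistic has (k - 1) squared degrees of freedom, which is 16 df for the 5 by 5 tables. With this form the coefficient is 1 for the perfect table and 0 for the uninformative one.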

For relatively comparing various classification techniques, the best
values observed for all the chi-squared tests can be chosen as the expected
chi-squared values. The actual observed chi-squared values for a particular
technique can then be measured against what is expected by computing chi-squared
values. The use of these criteria is illustrated in the next section.
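A minimal sketch of this relative comparison is given below, under stated assumptions. Each technique is represented by a list of criterion values, the "best" value of a criterion is the largest or smallest value observed across techniques depending on a flag, and each technique is then scored against those best values with a Pearson-style sum. The names `relative_chi_squared`, `scores`, and `larger_is_better` and the numbers are hypothetical, and the sketch requires every best value to be nonzero.

```python
def relative_chi_squared(scores, larger_is_better):
    """Score each technique against the best observed value of each criterion.

    scores: dict mapping technique name -> list of criterion values.
    larger_is_better: list of booleans, one per criterion.
    """
    techniques = list(scores)
    n_criteria = len(larger_is_better)

    # Pick the best observed value of each criterion across all techniques.
    best = []
    for k in range(n_criteria):
        column = [scores[t][k] for t in techniques]
        best.append(max(column) if larger_is_better[k] else min(column))

    # Treat the best values as "expected" and each technique as "observed".
    result = {}
    for t in techniques:
        result[t] = sum((scores[t][k] - best[k]) ** 2 / best[k]
                        for k in range(n_criteria))
    return best, result


# Hypothetical criterion values for two techniques (not Table 7 data).
scores = {"A": [80.0, 4.0, 100.0],
          "B": [90.0, 2.0, 50.0]}
best, relative = relative_chi_squared(scores, [True, False, True])
```

With n criteria the resulting value has n - 1 degrees of freedom, and a smaller value indicates a technique that is closer to the best observed performance on every criterion.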


III. EVALUATION RESULTS


Tables 2 through 6 are the contingency tables for the various techniques
being examined, and u, t, a, f, and w are the feature categories urban,
transportation, agriculture, forest, and water, respectively. The techniques are
identified by the labels:

MLCM - Maximum Likelihood Classifier (Map)

LCM - Linear Classifier (Map)

MLMCM - Maximum Likelihood Multitemporal Classifier (Map)

LMCM - Linear Multitemporal Classifier (Map)

DSCM - Density Slicing Classifier (Map)


All of the classification programs are supervised techniques, and all 
programs were supplied the same training areas. The multitemporal programs 
used 16 channels of seasonal data rather than one season containing only 4 
channels. Thus, all of the results have one season of data in common. 

TABLE 2. CONTINGENCY TABLE FOR GTM VERSUS MLCM 
Table 7 lists the statistical criteria as a function of classification
technique, and the numbers followed by an asterisk indicate the best numbers that
were observed. The degrees of freedom (df) associated with each chi-squared
value are also listed in Table 7. Of the 26 possible best numbers, MLCM has 6
of them, LCM has 3, MLMCM has 11, LMCM has 4, and DSCM has 2. By using
the numbers followed by an asterisk as expected values, a chi-squared value can
be computed for each technique that has n-1 or 25 df. These chi-squared
values are listed in Table 8.

Tables 2 through 8 represent a considerable amount of information that
needs an equal amount of discussion. First, for 1 df there is a 0.05 probability
of finding a chi-squared value larger than 3.841 and a 0.01 probability of finding
a value larger than 6.635. For 4 df the 0.05 and 0.01 chi-squared values are
9.488 and 13.277; for 16 df the 0.05 and 0.01 values are 26.296 and 32.0; and
for 25 df the 0.05 and 0.01 values are 37.652 and 44.314. Using these values
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TABLE 5. CONTINGENCY TABLE FOR GTM VERSUS LMCM 
TABLE 6. CONTINGENCY TABLE FOR GTM VERSUS DSCM
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TABLE 8. TECHNIQUE VERSUS CHI-SQUARED VALUES 
USING TABLE 7 DATA 


Technique      MLCM     LCM    MLMCM    LMCM     DSCM

Chi-squared   48950    5100     1597    6409   207832


and examining Tables 7 and 8 shows that every single hypothesis was rejected
and hardly any of the chi-squared values are even close to these numbers. An
attempt was made to understand why the chi-squared values are so large by
using the inventory from MLCM and computing the chi-squared values as a
function of inventory accuracy. Equation (3) shows that the proportion of
wrongly classified pixels for each category j is given by

$$\frac{| e_j - o_j |}{2 N_T} \qquad (14)$$

If it is assumed that these proportions remain constant for any inventory
accuracy, then Table 9 shows the inventory accuracy and chi-squared values
which result from this assumption.
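One way to read this procedure is sketched below. The interpretation is an assumption: the per-category differences between the observed and expected inventories are scaled by a common factor so that the overall inventory accuracy reaches a chosen target while the proportions among categories stay fixed, and the inventory chi-squared is then taken in the usual Pearson form. The function name and the inventories are hypothetical and are not the MLCM values.

```python
def scaled_inventory_chi_squared(e, o, target_pct_inventory):
    """Rescale the inventory errors to a target inventory accuracy.

    e, o: expected (GTM) and observed (CM) inventories with equal totals.
    Returns the rescaled observed inventory and its chi-squared value.
    """
    n_total = sum(e)

    # Current and target fractions of wrongly inventoried pixels.
    wrong_now = sum(abs(ei - oi) for ei, oi in zip(e, o)) / (2.0 * n_total)
    wrong_target = 1.0 - target_pct_inventory / 100.0

    # A common scale factor keeps the category proportions constant.
    scale = wrong_target / wrong_now
    o_scaled = [ei + scale * (oi - ei) for ei, oi in zip(e, o)]

    # Assumed Pearson form for the inventory test.
    chi2 = sum((osc - ei) ** 2 / ei for ei, osc in zip(e, o_scaled))
    return o_scaled, chi2


# Hypothetical inventories at 95 percent inventory accuracy.
e = [60, 40, 100]
o = [55, 50, 95]
o_scaled, chi2 = scaled_inventory_chi_squared(e, o, 99.5)
```

Under this reading the chi-squared value falls with the square of the scale factor, but for fixed proportions it also grows linearly with the total number of pixels, which is one plausible reason the values remain large for a test site of many thousands of pixels.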

TABLE 9. INVENTORY ACCURACY VERSUS CHI-SQUARED
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Thus, it is not possible to accept the hypothesis that the distribution of the 
classification inventory is statistically significant when compared with the ground
truth inventory even though the inventory is 99.5 percent correct. If the optimum
classification accuracy is considered, then the chi-squared value is almost
significant at 95 percent classification accuracy. Hence, it appears that the
chi-squared tests are extremely strict, but because of this they also appear to be
extremely good at relatively discriminating between the performance of various
techniques.

Tables 7 and 8 show that different conclusions would be obtained if the 
techniques were judged on classification accuracy only versus a set of criteria. 
Presumably, the set of criteria provides for better judgment because it offers 
a more complete description of performance. 

In Table 7, the $\chi^2$ value for hypothesis 2 shows that MLMCM benefited the
most from the use of multitemporal data even though the classification accuracy
increased less than 2 percent and the inventory accuracy less than 1 percent.
This indicates that the inventory distribution improved considerably, and the
inventory has to be relied on when there are no ground truth results. The
inventory accuracy is usually higher than the classification accuracy because
the misclassified pixels tend to cancel out not having classified enough pixels
correctly.

The $\chi^2$ values for hypothesis 3 show that the correctly classified pixels
are better estimators of the ground truth inventory distribution than the
classification inventory. Hence the error-cancelling effect of the correctly and
incorrectly classified pixels is not all that good. The $\chi^2$ values for
hypothesis 4 also show that the distribution of correctly and incorrectly
classified pixels is nowhere near optimum, but those for hypothesis 8 show that
they are closer to being optimally distributed than randomly distributed. The
$\chi^2$ values for hypothesis 5 also show that the correctly classified
pixels are closer to being optimally distributed than randomly distributed.

The $\chi^2$ values for hypothesis 6 show that each feature is not randomly
classified, although the categories urban and transportation are highly suspect.
In all cases, the agriculture category is the least randomly classified even
though it is not the most accurately classified or largest category. The
$\chi^2$ values for hypothesis 7 show that the ground truth category
transportation is highly suspect of randomly occurring in places classified as
other categories. This test was used primarily to determine if the number of
misclassified pixels for a particular category were distributed or proportional
to the population of the other ground truth categories. The $\chi^2$ values for
hypothesis 9 indicate that the number of correctly and incorrectly classified
pixels for each category are not randomly distributed, but again the urban and
transportation categories are suspect.
The values for $\chi_8^2$ and the coefficient of contingency show that the
contingency table distribution does not indicate independence of the ground
truth and classification results, but a 45 percent "correlation" is nothing to
be proud of either. Hence, it appears that the classification performance was
rather dismal for this test site. Although Table 8 indicates that MLMCM had the
best performance, the chi-squared value is still too large when measured against
the best possible performance of all the techniques.

There may be several reasons why the performance of the techniques is
lower than expected. The first is that the best season may not have been chosen
for those techniques that used only one set of data. Secondly, the test site is
rather small (33386 pixels) as test sites go. The observation was made that
the majority of classification errors occurred at the boundary of two or more
different features and that the homogeneous areas were classified consistently
accurately. Hence, if the test site had been expanded, it is expected that the
misclassification would increase linearly and correct classification would
increase proportionally to the area. Also, better choices of training areas
would probably be available. Expanding the site size would also provide a means
of checking the stability of the statistics calculated for the 33386 pixel test site.

The discussion of these evaluation criteria and results provides a means
of establishing a statistical base for determining the performance of various
classification techniques on different types of data sets and for various remote
sensing discipline applications. However, these criteria provide relatively little
information concerning the goodness of a CM. The tabular results provide
information only on how many of each category, whereas a map also provides
this information as well as where this information is located. The next section
addresses modification of the contingency table to provide information on the
spatial complexity of the test site, on where misclassification errors occur, and
on how well the CM agrees with the GTM.


IV. PROPOSED STATISTICAL TESTS FOR EVALUATING 

CM AND/OR GTM 


Although the previous tests contain relatively little information on the
goodness of maps produced by various classification techniques, the tests can
be adapted to provide some measure of map goodness. One possible clue as to
what approach should be taken to adapt these tests is that CM with identical
inventory and classification accuracies can appear quite different visually.
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Thus, for two such CM, the best choice would appear to be to select the map 
whose homogeneous areas and boundaries coincide best with the GTM homogeneous 
areas and boundaries. A proposed quantitative approach to making this selection
is to produce an 8 by 8 contingency matrix to replace each individual element in
the contingency table of GTM versus CM. The model used to provide numbers
for the contingency matrix in the contingency table is as follows.


Let $x_{i,j}$ be the reference sample on the CM and $y_{i,j}$ be the reference
sample on the GTM at scan i and column j. Let $x_{i-1,j}$ and $x_{i,j-1}$ be two
test samples adjacent to the reference sample on the CM at scans and columns
i-1,j and i,j-1, respectively, and let $y_{i-1,j}$ and $y_{i,j-1}$ be the
corresponding test samples on the GTM.


Several comparisons can be made between the reference and test samples
on either the GTM or CM and between corresponding samples on the GTM and
CM. For example, a vertical (horizontal) boundary would be indicated on the
CM if pixel $x_{i,j}$ belongs to a different class than pixel $x_{i,j-1}$
($x_{i-1,j}$). The same is true for the GTM if x is replaced by y. A homogeneous
pixel area occurs when $x_{i,j}$, $x_{i,j-1}$, and $x_{i-1,j}$ belong to the same
class on the CM. The same is also true for the GTM if x is replaced by y. A
double boundary occurs when the reference sample disagrees with both test
samples on either the CM or GTM. Comparisons also have to be made between the
GTM and CM to determine how many agreements there are concerning the three
corresponding pixels. In constructing the 8 by 8 matrix, the upper half will
contain entries when the reference samples belong to the same class on the GTM
and CM, the lower half will contain entries when the reference samples disagree
on the GTM and CM, the left half of the matrix will contain entries when either
or both of the test samples agree on the GTM and CM, and the right half of the
matrix will contain entries when either or both of the test samples disagree on
the GTM and CM. A pictorial description of the 8 by 8 contingency matrix is
shown in the Figure, and an explanation of the column and row labeling, as well
as the entry values, follows.


The row or GTM label definitions are:

1.1 - The reference samples agree on GTM and CM. There is no feature change
in either the scan or column direction (homogeneous pixel area).

1.2 - The reference samples agree on GTM and CM. There is a feature change
in the scan direction only (vertical boundary).
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Figure. 8 by 8 contingency matrix. 


1.3 - The reference samples agree on the GTM and CM. There is a feature
change in the column direction only (horizontal boundary).

1.4 - The reference samples agree on the GTM and CM. There is a feature
change in the scan and column directions (double boundary).

The row definitions of 2.1, 2.2, 2.3, and 2.4 are identical to the above,
except that the reference samples on the GTM and CM disagree. The column or
CM labels 1.1, 1.2, 1.3, 1.4 are identical to those previously defined, but 2.1,
2.2, 2.3, 2.4 refer to test sample disagreements. The entry values in the
matrix range from zero to three. The left half of the matrix contains the number
of agreements on the GTM and CM concerning the three pixel locations, and the
right half contains the number of disagreements. For example, every time a
1.1 condition is encountered on the GTM and CM, all three pixels are in
agreement and a three is added to the sum (which is initially zero for all
elements) contained in matrix element 1.1, 1.1. Notice that 1.1, 2.1 and
2.1, 1.1 are impossible situations and always contain zero. In the case where
1.4, 1.4 is encountered, there can be 3, 2, or 1 agreements, which are added to
the sum in matrix element 1.4, 1.4, and there can be 0, 1, or 2 disagreements,
respectively, which are added to the sum of matrix element 1.4, 2.4.
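The accumulation rule described above can be sketched in Python as follows. This is one reading of the model, not a transcription of the original program: maps are lists of rows of class labels, pixels in the first scan and first column are skipped because they lack one of the two test samples (an assumption), one 8 by 8 matrix is kept per pair of reference classes, and the names `boundary_type` and `contingency_matrices` are hypothetical. Rows 0 to 3 correspond to labels 1.1 to 1.4 and rows 4 to 7 to labels 2.1 to 2.4; columns are numbered the same way.

```python
def boundary_type(labels, i, j):
    """1 = homogeneous, 2 = vertical, 3 = horizontal, 4 = double boundary."""
    ref = labels[i][j]
    differs_left = labels[i][j - 1] != ref    # change along the scan
    differs_above = labels[i - 1][j] != ref   # change along the column
    if differs_left and differs_above:
        return 4
    if differs_left:
        return 2
    if differs_above:
        return 3
    return 1


def contingency_matrices(gtm, cm):
    """Build one 8 by 8 matrix per (GTM class, CM class) of the reference pixel."""
    matrices = {}
    for i in range(1, len(gtm)):
        for j in range(1, len(gtm[0])):
            positions = [(i, j), (i - 1, j), (i, j - 1)]
            # Number of the three pixel locations where GTM and CM agree.
            agreements = sum(1 for r, c in positions if gtm[r][c] == cm[r][c])
            ref_agrees = gtm[i][j] == cm[i][j]
            # Upper half: reference samples agree; lower half: they disagree.
            row = (0 if ref_agrees else 4) + boundary_type(gtm, i, j) - 1
            col = boundary_type(cm, i, j) - 1
            key = (gtm[i][j], cm[i][j])
            matrix = matrices.setdefault(key, [[0] * 8 for _ in range(8)])
            matrix[row][col] += agreements           # left half: agreements
            matrix[row][col + 4] += 3 - agreements   # right half: disagreements
    return matrices


# Hypothetical 3 by 3 maps (class labels are arbitrary integers).
gtm = [[1, 1, 2],
       [1, 1, 2],
       [3, 3, 2]]
cm = [[1, 1, 2],
      [1, 2, 2],
      [3, 3, 2]]
matrices = contingency_matrices(gtm, cm)
```

Each processed pixel contributes exactly three counts to its matrix, which is why the pixel count can later be recovered by summing the matrix and dividing by three.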
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To construct the contingency table using the 8 by 8 contingency matrix, it 
is necessary to use only half of the 8 by 8 matrix for each table element. Thus,
for the diagonal elements of the table, only the upper half of the matrix is used
because the bottom half will be all zeros. For the off-diagonal elements of the
table, only the lower half of the matrix is used because the top half will be all
zeros. Hence, each element in the contingency table is replaced by a 4 by 8
contingency matrix. Notice that the original values of the single element
contingency table can be obtained by adding the right half of the 4 by 8 matrix
to the left half for each table entry, computing the sum of all of the elements
of the resulting 4 by 4 matrix, and dividing by three. Therefore, the contingency
matrix not only contains the same information as the contingency table, but it
also contains a considerable amount of information related to the structure of
the CM.

There are several types of map structure information that can be
obtained from the 4 by 8 contingency matrices. By adding the right half of the
4 by 8 matrix to the left half and dividing all of the elements of the resulting
4 by 4 matrix by three, the 4 by 4 matrix will contain the number of homogeneous
pixels, vertical boundaries, horizontal boundaries, and double boundaries on the
diagonal elements for correctly classified pixels. The off-diagonal elements of
the matrix contain the number of errors where feature changes occurred on the
CM, but did not occur on the GTM or vice versa. Previous work done on
identifying the major source of classification errors has indicated that the
majority of misclassification occurs at a boundary between two or more
different features. The matrix will help narrow down what type of boundaries
produce the most errors. By not adding the right half to the left half of the 4 by
8 matrix, it is possible to determine, for those elements having only two possible
values, the number of events having each value. This is not possible with the
matrix elements that can have three values.
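The folding step can be illustrated with a short sketch. The function names and the 4 by 8 matrix are hypothetical. Integer division is used on the assumption that every event adds exactly three counts to a left-half element and its right-half partner, so each folded sum is a multiple of three.

```python
def fold_to_4x4(matrix_4x8):
    """Add the right half to the left half and divide every element by three."""
    return [[(row[c] + row[c + 4]) // 3 for c in range(4)] for row in matrix_4x8]


def pixel_count(matrix_4x8):
    """Recover the single contingency table element from its 4 by 8 matrix."""
    return sum(sum(row) for row in fold_to_4x4(matrix_4x8))


# Hypothetical 4 by 8 matrix for one diagonal element of the contingency table.
upper_half = [[3, 0, 0, 0, 0, 0, 0, 0],
              [0, 6, 0, 0, 0, 0, 0, 0],
              [0, 0, 3, 0, 0, 0, 0, 0],
              [0, 0, 0, 2, 0, 0, 0, 1]]
folded = fold_to_4x4(upper_half)
count = pixel_count(upper_half)
```

In this example the folded matrix holds one homogeneous pixel, two vertical boundaries, one horizontal boundary, and one double boundary, so the original table element is 5.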

By comparing a GTM with itself, the contingency table will contain only
diagonal elements and these diagonal elements will contain 4 by 4 matrices
(which are the upper left quarter of the original 8 by 8 matrices) that are
themselves diagonal. These 4 by 4 diagonal matrices provide a means of determining
the spatial complexity of each feature in terms of the number of observed
homogeneous pixels and various types of pixel boundaries. By adding all of the
4 by 4 diagonal matrices, a general measure of spatial complexity can be obtained
for the entire GTM independent of feature. These measures are the expected
distributions that can be used in various tests for comparing the CM (observed
distributions) with the GTM to determine how well the spatial complexities
agree. Thus, the comparing of spatial complexities provides a means of selecting
the best CM from several maps that have similar inventory and classification
accuracies. Comparisons can also be made between the various CM as well as
comparing a CM with itself, if that type of information is desired.
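When a map is compared with itself, the spatial complexity reduces to counting, for each feature, how many pixels are homogeneous, vertical boundaries, horizontal boundaries, or double boundaries. The sketch below does that directly from a label map. It assumes that pixels in the first scan and first column are skipped, and the function names and the example map are hypothetical.

```python
def spatial_complexity(labels):
    """Count [homogeneous, vertical, horizontal, double] pixels per feature."""
    counts = {}
    for i in range(1, len(labels)):
        for j in range(1, len(labels[0])):
            ref = labels[i][j]
            differs_left = labels[i][j - 1] != ref    # vertical boundary
            differs_above = labels[i - 1][j] != ref   # horizontal boundary
            # Index 0: neither, 1: left only, 2: above only, 3: both.
            kind = (1 if differs_left else 0) + (2 if differs_above else 0)
            counts.setdefault(ref, [0, 0, 0, 0])[kind] += 1
    return counts


def percent_homogeneous(counts):
    """Percentage of all counted pixels that are homogeneous."""
    total = sum(sum(c) for c in counts.values())
    homogeneous = sum(c[0] for c in counts.values())
    return 100.0 * homogeneous / total


# Hypothetical 3 by 3 ground truth map.
gtm = [[1, 1, 2],
       [1, 1, 2],
       [3, 3, 2]]
profile = spatial_complexity(gtm)
overall = percent_homogeneous(profile)
```

The per-feature counts play the role of the expected distribution, and the same function applied to a CM gives the observed distribution to compare against it.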
Table 10 shows the contingency matrix for comparing the GTM with
itself. For the urban category, which contains 276 pixels, there were 66 urban
pixels (23.91 percent of the urban pixels) that had an urban pixel directly above
(previous scan, same column) and an urban pixel directly to the left
(same scan, previous column). There were also 29 urban pixels (10.5 percent)
that had an urban pixel directly above and no urban pixel directly to the left
(vertical boundary), 23 urban pixels (8.33 percent) that had an urban pixel
directly to the left and no urban pixel directly above (horizontal boundary),
and 158 urban pixels (57.24 percent) that had no urban pixels directly above or
to the left (double boundary). In describing the features on the GTM, it could
be said that the urban feature is 23.9 percent homogeneous, transportation is
4 percent homogeneous, agriculture is 77.5 percent homogeneous, forest is
73.1 percent homogeneous, and water is 18.3 percent homogeneous. There is
a correspondence between the homogeneity of a feature and the feature
classification accuracy in that the more homogeneous features appear to be more
accurately classified.

TABLE 10. GTM/GTM CONTINGENCY MATRIX FOR EACH FEATURE 



Table 11 shows the contingency matrix for all of the GTM features 
combined. 
TABLE 11. GTM/GTM CONTINGENCY MATRIX FOR ALL FEATURES



The table indicates that the entire map is 71.1 percent homogeneous,
which corresponds very closely with the classification accuracies presented in
Tables 2 through 6. Thus, it appears that the homogeneity percentage for the
GTM could be used as a good estimate of expected minimum classification
accuracy. Table 11 also indicates that it may be worthwhile to consider using
spatial information in the classifier because 91 percent of the pixels belong to
the same feature as the previous pixel in the same scan or same column.

Table 12 shows the contingency matrix of MLCM/GTM for each feature.
The diagonal and row percentages for correct classification are obtained by
ratioing the diagonal elements and row sums of the diagonal matrices in Table
12 with the elements in Table 10. For forest, the diagonal percentages show
that 64.2 percent of the time the reference pixel was correctly classified
when the previous pixels in the same scan and same column were also correctly
classified as forest. For the case where the reference pixel and the previous
pixel in the same column were correctly classified as forest, but the previous
pixel in the same scan was categorized as belonging to another feature, the
success was only 16.8 percent. In the case of a horizontal boundary for forest,
the success in correct classification was only 13.5 percent, and for the case of
a double boundary for forest the success was only 11.4 percent. This situation
seems to be typical for large homogeneous areas, indicating that the interior
pixels tend to be more correctly classified than the transition or boundary
pixels between two or more features. If the constraint is removed that the
previous pixels in the same scan and same column on the CM have to agree with
the class configuration of the corresponding pixels on the GTM, then the row
percentages show that for forest, when there is no feature change in the
previous pixels on the GTM, the reference pixel on the CM is correctly
classified 86.2 percent of the time.

The situation appears to be different for highly linear features such as 
transportation/communication (t) . In this case, the diagonal and row percent- 
ages are higher when a feature change is present in the previous pixels. This 
is probably due to high data contrast between roadways and power line right of 
ways versus forested areas. 
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TABLE 12. MLCM/GTM CONTINGENCY MATRIX FOR EACH FEATURE 



For correct classifications only.

It also appears that the effect of banding can be observed by examining
how the diagonal and row percentages change for the 1.2 and 1.3 cases. If the
class configuration is preserved on the CM and GTM (diagonal percent), the
classification accuracy is higher for 1.2 (vertical boundary). However, if the
class configuration is ignored on the CM, the classification accuracy is higher
for 1.3 (horizontal boundary on GTM). Both situations are supported by the fact
that banding is observed as a horizontal phenomenon produced by data changes
in the vertical direction.

Table 13 is a summary of the information in Table 12 for all features.
The diagonal and row percentages were obtained by ratioing the diagonal
elements and row sums of Table 13 with the elements of Table 11. The total
diagonal percentage was obtained by ratioing the sum of the diagonal elements
in Table 13 with the sum of the diagonal elements in Table 11. The diagonal
and row percentages indicate essentially the same results as previously
mentioned. However, it is interesting to compare three types of classification
accuracy based upon different constraints. For MLCM, Table 2 shows that if
the total number of pixels for each feature on the CM (regardless of where they
occur) are compared with the total number of pixels for each feature on the
GTM, then the inventory accuracy is 81.94 percent. If the constraint is added
that the CM feature pixels are correct if they agree with the GTM feature
pixels at the same location, then the classification accuracy is 71.67 percent.
If a constraint is added that feature changes on CM and GTM have to agree
together with the correctly classified pixels, then a measure of the map
accuracy is 45.77 percent as indicated by the total diagonal percentage in
Table 13.


TABLE 13. MLCM/GTM CONTINGENCY MATRIX FOR ALL FEATURES 
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1.1 
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Total Diagonal Percentage 

45.77 
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